Sizes of Order Statistical Events
By 

Daniel  Rudolph  and  J.  Michael  Steele 
1.  Introduction 

One of the key properties of independent, identically
distributed continuous random variables $Y_i$, $1 \le i \le n$, is that
the order statistical events defined by

    $\{ Y_{\sigma(1)} < Y_{\sigma(2)} < \cdots < Y_{\sigma(n)} \}$

have the same probability $1/n!$ for any permutation
$\sigma : [1,n] \to [1,n]$.
The  main  objective  of  this  paper  is  to  determine  the  extent  to  which 
this  property  is  retained  asymptotically  for  processes  which  are  only 
assumed  to  be  stationary  and  ergodic. 
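
The i.i.d. property just described can be checked empirically. The following is a minimal sketch, assuming uniform random variables as the continuous distribution; the helper names `order_pattern` and `pattern_frequencies` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter
from math import factorial


def order_pattern(values):
    """Return sigma as a tuple of 1-based indices such that
    values[sigma(1)-1] < values[sigma(2)-1] < ... (no ties assumed)."""
    return tuple(i + 1 for i in sorted(range(len(values)), key=values.__getitem__))


def pattern_frequencies(n, trials, seed=0):
    """Empirical frequency of each order statistical event for n i.i.d.
    uniform variables (an assumed example distribution)."""
    rng = random.Random(seed)
    counts = Counter(
        order_pattern([rng.random() for _ in range(n)]) for _ in range(trials)
    )
    return {sigma: c / trials for sigma, c in counts.items()}


freqs = pattern_frequencies(n=3, trials=60000)
for sigma, p in sorted(freqs.items()):
    print(sigma, round(p, 4), "target", round(1 / factorial(3), 4))
```

Each of the 3! = 6 orderings should appear with frequency close to 1/6, up to sampling error.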

To set the problem precisely, we suppose that $\{X_i\}_{i=1}^{\infty}$
is a strictly stationary, ergodic process defined on the probability
space $(\Omega, \mathcal{F}, P)$. We will also use the representation of
such a process by $X_i(\omega) = f(T^{-i+1}\omega)$, $1 \le i < \infty$,
where $T : \Omega \to \Omega$ is an ergodic measure preserving
transformation and $f : \Omega \to \mathbb{R}$ is a measurable map. To
avoid inessential messiness, we also restrict attention to processes
which satisfy the continuity property

(1.1)  $P(X_i = X_j) = 0$,  $i \ne j$.

Our  approach  to  the  analysis  of  the  order  statistical  events 
is  motivated  by  the  Shannon-McMillan-Breiman  theorem,  and  particularly 
the phrasing of that result in terms of the equipartition property
([1, p. 135], [9, p. 35 (6.3)]). Loosely speaking, that phrasing
tells one in terms of the entropy of T just how many sets of a certain
type are needed to cover most of $\Omega$.


To establish a comparable result for the order statistical
events, we let $Q_n(F)$ be defined for any $F \in \mathcal{F}$ by

    $Q_n(F) = |\{\sigma : X_{\sigma(1)}(\omega) < X_{\sigma(2)}(\omega) < \cdots < X_{\sigma(n)}(\omega)$ for some $\omega \in F\}|$.

Here $|S|$ denotes the cardinality of the set S, so $Q_n(F)$ is equal to
the least number of order statistical events
$A_\sigma = \{\omega : X_{\sigma(1)}(\omega) < X_{\sigma(2)}(\omega) < \cdots < X_{\sigma(n)}(\omega)\}$
which one needs to cover F.
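
To make the definition of $Q_n(F)$ concrete, here is a small sketch that counts the distinct orderings of $(X_1, \ldots, X_n)$ seen on a finite set of sample points. The rotation process, the angle `THETA`, and the function names are illustrative assumptions, not objects from the paper. For a circle rotation the ordering can change only when one of the n points wraps past zero, so at most n orderings can occur, far fewer than n!.

```python
from math import factorial


def order_pattern(values):
    """Indices (1-based) that sort the values increasingly."""
    return tuple(i + 1 for i in sorted(range(len(values)), key=values.__getitem__))


def count_patterns(process, n, sample_points):
    """Number of distinct orderings of (X_1, ..., X_n) over the sample points,
    i.e. Q_n(F) for the finite set F = sample_points."""
    return len(
        {order_pattern([process(i, w) for i in range(1, n + 1)]) for w in sample_points}
    )


THETA = (5 ** 0.5 - 1) / 2  # hypothetical irrational rotation angle


def rotation_process(i, w):
    """X_i(w) = frac(w + (i - 1) * THETA), an assumed zero-entropy example."""
    return (w + (i - 1) * THETA) % 1.0


n = 6
grid = [j / 1000 for j in range(1000)]
q = count_patterns(rotation_process, n, grid)
print("orderings seen:", q, "out of n! =", factorial(n))
```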

The quantity of main interest is now defined for $\epsilon > 0$ by

(1.2)  $Q_n^*(\epsilon) = \min_{E : P(E) < \epsilon} Q_n(\Omega \setminus E)$,

so $Q_n^*(\epsilon)$ is the least number of $A_\sigma$ which will cover a
set of probability $1 - \epsilon$.

To familiarize $Q_n^*(\epsilon)$, we note that if the $X_i$ are i.i.d.
and satisfy (1.1), then $Q_n^*(\epsilon)$ is equal to the least integer
greater than $(1 - \epsilon)n!$.
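
As a worked instance of the i.i.d. formula above, the sketch below evaluates the least integer strictly greater than $(1 - \epsilon)n!$ using exact rational arithmetic. The function name `q_star_iid` is a hypothetical label for this illustration.

```python
from fractions import Fraction
from math import factorial, floor


def q_star_iid(n, eps):
    """Least integer strictly greater than (1 - eps) * n!.

    eps is converted to an exact Fraction so that boundary cases such as
    (1 - eps) * n! being an integer are handled without rounding error.
    """
    eps = Fraction(eps)
    return floor((1 - eps) * factorial(n)) + 1


print(q_star_iid(3, Fraction(1, 2)))   # (1 - 1/2) * 6 = 3, so the answer is 4
print(q_star_iid(4, Fraction(1, 10)))  # (1 - 1/10) * 24 = 21.6, so the answer is 22
```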

In this particular example $\{X_i\}_{i=1}^{\infty}$ is a process with
infinite entropy, and $Q_n^*(\epsilon)$ is near its a priori upper
bound. One intuitively expects that $Q_n^*(\epsilon)$ should be of
smaller order than $n!$ for processes with finite entropy. We show more
precisely that in that case $Q_n^*(\epsilon)$ is, in fact, exponentially
smaller.

Theorem 1. For any stationary ergodic process $\{X_i\}_{i=1}^{\infty}$
which has finite entropy and satisfies $P(X_i = X_j) = 0$, $i \ne j$,
there is for any $\epsilon > 0$ a sequence of positive reals $\rho_n$
tending to zero for which

(1.3)  $Q_n^*(\epsilon) \le n!\,\rho_n^n$  for all $n \ge 1$.
Before giving the complementing result, two comments are in
order. In the first place we note there are many processes satisfying
the hypotheses, since if f satisfies $P\{\omega : f(\omega) = y\} = 0$
for all $y \in \mathbb{R}$, the condition $P(X_i = X_j) = 0$, $i \ne j$,
trivially holds for $\{X_i\}$ and any measure preserving T. Also,
transformations of finite entropy not only abound but play key roles in
such distinct subjects as statistical mechanics and the metrical theory
of diophantine approximation ([1]).

Second, we note (1.3) is equivalent to saying
$Q_n^*(\epsilon) = o(\rho^n n!)$ for each $\rho > 0$. The phrasing of
Theorem 1 was chosen in view of the next result which makes precise the
sense in which Theorem 1 is best possible.

Theorem 2. For any positive $\rho_n$ which tend to zero there is a
stationary, ergodic process $\{X_i\}_{i=1}^{\infty}$ with
$P(X_i = X_j) = 0$, $i \ne j$, which has zero entropy, and which
satisfies

    $Q_n^*(\epsilon) > (n!)\,\rho_n^n$

for infinitely many n and any $\epsilon < 1$.

The preceding theorem is easily seen to be a consequence of
the next result which shows that the underlying T plays a surprisingly
small role in determining $Q_n^*(\epsilon)$.

Theorem 3. Given any ergodic measure preserving transformation T
on a non-atomic probability space $(\Omega, \mathcal{F}, P)$, and given
any $\rho_n > 0$ tending to zero, there is an $f : \Omega \to [0,1]$
which satisfies

(1.4)  $P(\{\omega : f(\omega) = x\}) = 0$,  $\forall x \in [0,1]$,

and

(1.5)  $Q_n^*(\epsilon) > (n!)\,\rho_n^n$

for infinitely many n and any $\epsilon < 1$.

In the next section we give the proof of Theorem 1 as a
consequence of a counting argument and the application of the
Shannon-McMillan-Breiman theorem to an appropriately chosen partition.

The  proof  of  Theorem  3  is  more  subtle  and  makes  use  of  a 
generalization  of  a  combinatorial  structure  known  as  de  Bruijn 
sequences . 

Since the construction provides a technique for building copies
of a finite sequence of independent random variables inside a general
stationary process, the construction should be useful in a variety of
problems.

Finally,  in  the  fourth  section  a  brief  speculation  on  the 
theory  of  order  statistical  events  is  ventured. 

2.  The  Upper  Bound  Method 

For any measure preserving transformation $S : \Omega \to \Omega$ and
any partition $\mathcal{P} = \{P_i\}_{i=1}^{s}$ the sets given by

(2.1)  $\bigcap_{j=0}^{n-1} \{\omega : S^{j}(\omega) \in P_{i_{j+1}}\}$

for some $1 \le i_1, \ldots, i_n \le s$ will be called the
$n$-$\mathcal{P}$-$S$ name associated with the $n$-$\mathcal{P}$-$S$
alias $(i_1, i_2, \ldots, i_n)$. If S is ergodic and has entropy
$H(S) < \alpha$, the Shannon-McMillan-Breiman theorem says there is an
$n_0 = n_0(\epsilon, \alpha, S, \mathcal{P})$ such that for $n \ge n_0$
there is a collection of $2^{\alpha n}$ of the $n$-$\mathcal{P}$-$S$
names whose union has probability at least $1 - \epsilon$.

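Since $P(X_i = X_j) = 0$, $i \ne j$, the disjoint sets $P_\sigma$ given
by the permutations $\sigma$ of $\{1, 2, \ldots, k\}$ by

    $P_\sigma = \{\omega : X_{\sigma(1)} < X_{\sigma(2)} < \cdots < X_{\sigma(k)}\}$

have union $\Omega$ (except for a set of measure zero). The partition
$\mathcal{P} = \{P_\sigma\}$ can be related usefully to the possible
orderings of $\{X_i\}_{i=1}^{nk}$.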


Lemma 2.1. For any $n$-$\mathcal{P}$-$T^k$ name A we have

    $Q_{nk}(A) \le (nk)!/(k!)^n$.

Proof. First consider n = 2. The $2$-$\mathcal{P}$-$T^k$ name A has an
associated sequence $(i_1, i_2)$; and $i_1$ determines the ordering of
$R_1 = \{X_1(\omega), X_2(\omega), \ldots, X_k(\omega)\}$, while $i_2$
determines the order of
$R_2 = \{X_{k+1}(\omega), X_{k+2}(\omega), \ldots, X_{2k}(\omega)\}$. To
count the possible orderings of $R_1 \cup R_2$, we note the set $R_1$
determines k + 1 intervals
$(-\infty, X_{(1)}), (X_{(1)}, X_{(2)}), \ldots, (X_{(k)}, \infty)$,
where $\{X_{(i)}\}_{i=1}^{k}$ are the order statistics of
$\{X_i\}_{i=1}^{k}$. Since there are $\binom{2k}{k}$ ways of putting the
order statistics of $\{X_i\}_{i=k+1}^{2k}$ into the k + 1 intervals, we
have

    $Q_{2k}(A) \le \binom{2k}{k} = (2k)!/(k!)^2$.

In general, we write
$R_j = \{X_{jk+1}, X_{jk+2}, \ldots, X_{(j+1)k}\}$ for $0 \le j < n$ (so
the sets called $R_1$ and $R_2$ above become $R_0$ and $R_1$). For
$0 \le t \le n-2$ the union $\bigcup_{j=0}^{t} R_j$ determines
$(t+1)k + 1$ intervals into which the order statistics of $R_{t+1}$ can
be placed in $\binom{(t+2)k}{k}$ ways. Making the sequential choices we
have

    $Q_{nk}(A) \le \binom{2k}{k}\binom{3k}{k} \cdots \binom{nk}{k} = (nk)!\,(k!)^{-n}$,

which  completes  the  lemma.  [] 
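
The count in Lemma 2.1 can be confirmed by brute force for small n and k. The sketch below is only an illustration (the function names are hypothetical): it fixes the internal order of each block of k consecutive variables and counts the overall rankings of all nk variables that are compatible with those block orders.

```python
from itertools import permutations
from math import factorial


def count_consistent_orderings(n, k):
    """Count rankings of nk distinct values in which every block of k
    consecutive indices has a prescribed internal order (here: increasing)."""
    total = 0
    for ranks in permutations(range(n * k)):
        # ranks[i] is the rank of X_{i+1}; block j holds indices jk .. jk+k-1.
        if all(
            ranks[j * k + a] < ranks[j * k + a + 1]
            for j in range(n)
            for a in range(k - 1)
        ):
            total += 1
    return total


def lemma_bound(n, k):
    """The bound (nk)! / (k!)^n of Lemma 2.1."""
    return factorial(n * k) // factorial(k) ** n


for n, k in [(2, 2), (3, 2), (2, 3)]:
    print(n, k, count_consistent_orderings(n, k), lemma_bound(n, k))
```

In these small cases the brute-force count coincides with the bound, which matches the sequential binomial argument in the proof.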

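To prove Theorem 1 we need to bound the number of
$n$-$\mathcal{P}$-$T^k$ names which are needed to cover a set of
probability $1 - \epsilon$. We first note that any
$nk$-$\mathcal{P}$-$T$ name is contained in some
$n$-$\mathcal{P}$-$T^k$ name because for any alias
$(i_1, \ldots, i_{nk})$ one has

    $\bigcap_{j=0}^{nk-1} \{\omega : T^{j}\omega \in P_{i_{j+1}}\} \subset \bigcap_{j=0}^{n-1} \{\omega : T^{jk}\omega \in P_{i_{jk+1}}\}$.

The Shannon-McMillan-Breiman theorem applied to the ergodic
transformation T with entropy $H(T) < \alpha < \infty$ says there is a
collection C of $2^{\alpha nk}$ of the $nk$-$\mathcal{P}$-$T$ names
whose union contains $\Omega \setminus E$ with $P(E) < \epsilon$ for all
$n \ge n_0 = n_0(\epsilon, \mathcal{P})$. By the preceding remark this
also implies there is a collection C' of $2^{\alpha nk}$ of the
$n$-$\mathcal{P}$-$T^k$ names with the same property.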

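We now see

    $Q_{nk}(\Omega \setminus E) \le \sum_{A \in C'} Q_{nk}(A) \le 2^{\alpha nk}(nk)!\,(k!)^{-n}$,

where the last inequality follows from Lemma 2.1. For any k we can
write $m = nk + r$ with $1 \le r \le k$, so one has

    $Q_m^*(\epsilon) \le Q_{(n+1)k}^*(\epsilon) \le 2^{\alpha(n+1)k}((n+1)k)!\,(k!)^{-n-1}$

provided $n \ge n_0$.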

Since $((n+1)k)!/m! \le ((n+1)k)^k$, and $k! \ge k^k e^{-k}$, we have

    $Q_m^*(\epsilon) \le m!\,\{2^{(\alpha+1)m}((n+1)k)^k (k/e)^{-k(n+1)}\}$

and the fixed integer k was arbitrary, so

    $Q_m^*(\epsilon) = O(\rho^m m!)$

for all $\rho > 0$. The implied constant depends not only on H(T) but on
T through the $n_0$ given by the SMB theorem. As noted earlier, this
last relation is sufficient to imply Theorem 1.
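
To see numerically why the block length k can drive the exponential rate to zero, take m = (n+1)k, so that the bound $2^{\alpha m}\,m!\,(k!)^{-m/k}$ is $m!\,\rho(k)^m$ with $\rho(k) = 2^{\alpha}/(k!)^{1/k}$. The sketch below evaluates this base for a few k; the function name `rho` and the sample value of alpha are assumptions made only for illustration.

```python
from math import exp, lgamma, log


def rho(alpha, k):
    """Base rho(k) with 2**(alpha*m) * (k!)**(-m/k) == rho(k)**m
    whenever m is a multiple of k. Uses lgamma(k + 1) = log(k!)."""
    return exp(alpha * log(2) - lgamma(k + 1) / k)


alpha = 1.0  # hypothetical entropy bound, in bits
for k in (1, 2, 5, 10, 100, 1000):
    print(k, rho(alpha, k))
```

Since $(k!)^{1/k}$ grows without bound, $\rho(k)$ decreases to zero as k increases, which is the mechanism behind the $O(\rho^m m!)$ estimate for every $\rho > 0$.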

3.  The  Lower  Bound  Method 

The first lemma required for Theorem 3 is the so-called strong
form of Rohlin's lemma ([8, p. 22]) which provides a systematic method
for applying combinatorial constructions to stationary processes.

Lemma 3.1. Suppose T is an ergodic, measure preserving
transformation on a non-atomic probability space
$(\Omega, \mathcal{F}, P)$. For any finite partition
$\mathcal{H} = (H_1, H_2, \ldots, H_s)$ of $\Omega$ and for any real
$\epsilon > 0$ and integer m there is an $E \in \mathcal{F}$ with the
following properties

(3.1)  $E, T^{-1}E, \ldots, T^{-m+1}E$ are disjoint,

(3.2)  $P(\bigcup_{i=0}^{m-1} T^{-i}E) > 1 - \epsilon$,

(3.3)  $P((T^{-i}E) \cap H_j) = P(E)P(H_j)$,  $0 \le i < m$,  $1 \le j \le s$.

The  second  lemma  we  need  is  a  graph  theoretic  result  due  to  I.  J.  Good 
([5],  [6,  p.  95])  which  sharpens  a  well-known  result  of  Euler. 

Lemma 3.2. If G is a connected directed graph, and if at each
point of G there are the same number of arcs going out as coming in,
then there is a directed cycle in G that goes through every arc of G
in its given direction, and uses no arc twice.

As  an  application  of  Lemma  3.2,  we  will  obtain  the  existence 
of  what  can  be  called  s-ary  de  Bruijn  sequences.  To  introduce  these 
sequences,  we  recall  the  classic  result  of  de  Bruijn  which  says  the 
following : 

Given a positive integer n, there is a sequence of 0's
and 1's of length $N = 2^n$, say $a_1 a_2 \ldots a_N$, such that the
n-tuples $a_i a_{i+1} \cdots a_{i+n-1}$ are a complete list of all
$2^n$ of the n-tuples with alphabet {0,1}.

Here $a_1$ is understood to follow $a_N$, etc., in the cycle. For
example, when n = 3, the cycle (00010111) contains every 3-tuple of 0's
and 1's exactly once. We will use the following generalization where
$\mathcal{C}$ is an alphabet of s letters and $\mathcal{B}$ is the set
of all ordered k-tuples of $\mathcal{C}$.
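
The n = 3 example can be checked mechanically. The following sketch (helper names are hypothetical) lists the cyclic windows of a candidate cycle and tests whether every n-tuple over the alphabet occurs exactly once.

```python
def cyclic_windows(cycle, n):
    """All length-n windows of the string `cycle`, read cyclically."""
    doubled = cycle + cycle[: n - 1]
    return [doubled[i : i + n] for i in range(len(cycle))]


def is_de_bruijn(cycle, alphabet, n):
    """True if each n-tuple over `alphabet` occurs exactly once in the cycle."""
    windows = cyclic_windows(cycle, n)
    return (
        len(cycle) == len(alphabet) ** n
        and len(set(windows)) == len(windows)
        and all(set(w) <= set(alphabet) for w in windows)
    )


print(cyclic_windows("00010111", 3))
print(is_de_bruijn("00010111", "01", 3))  # the cycle quoted in the text
print(is_de_bruijn("00011011", "01", 3))  # a cycle that repeats some 3-tuples
```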


Lemma 3.3. There is a sequence $a_1 a_2 \ldots a_N$ with $N = s^k$ of
the elements of $\mathcal{C}$ such that each element of $\mathcal{B}$
occurs exactly once in the set of k-tuples
$(a_{r+1}, a_{r+2}, \ldots, a_{r+k})$, where $0 \le r < s^k$ and
$a_t = a_u$ if $t > s^k$ and $t - s^k = u$.

Proof. We define a directed graph G whose vertices are the ordered
(k-1)-tuples formed by elements of $\mathcal{C}$. We have an edge from
$(b_1, b_2, \ldots, b_{k-1})$ to $(b'_1, \ldots, b'_{k-1})$ provided
$b_2 = b'_1$, $b_3 = b'_2$, ..., $b_{k-1} = b'_{k-2}$, and $b'_{k-1}$
is arbitrary. Every vertex has in-degree and out-degree equal to s, so
Lemma 3.2 implies there is a cycle which traverses the edges of G and
uses each exactly once. From such a cycle the sequence of
$a_i \in \mathcal{C}$ given by the successive $b'_{k-1}$ is easily
checked to satisfy the claim of the lemma.
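
The proof is constructive once an Eulerian cycle is available. The sketch below follows the same idea, with vertices the (k-1)-tuples and one arc per appended letter, and finds the cycle with a standard stack-based Eulerian-circuit search (Hierholzer's method). That search is a substitute chosen for this illustration, not the argument of Good cited in the paper, and the function names are hypothetical.

```python
from itertools import product


def de_bruijn(alphabet, k):
    """Cyclic sequence of length s**k over `alphabet` in which every k-tuple
    occurs exactly once, read off an Eulerian cycle on (k-1)-tuples."""
    alphabet = list(alphabet)
    if k == 1:
        return alphabet
    # Each vertex is a (k-1)-tuple; its unused outgoing arcs are labelled by letters.
    unused = {v: list(reversed(alphabet)) for v in product(alphabet, repeat=k - 1)}
    start = (alphabet[0],) * (k - 1)
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if unused[v]:
            letter = unused[v].pop()
            stack.append(v[1:] + (letter,))  # follow the arc labelled `letter`
        else:
            circuit.append(stack.pop())  # vertex is exhausted: emit it
    circuit.reverse()  # closed walk using each of the s**k arcs once
    # The last letter of each vertex entered gives the sequence.
    return [v[-1] for v in circuit[1:]]


def cyclic_tuples(seq, k):
    """All k-tuples of consecutive letters of `seq`, read cyclically."""
    n = len(seq)
    return [tuple(seq[(i + j) % n] for j in range(k)) for i in range(n)]


seq = de_bruijn("01", 3)
print("".join(seq), len(set(cyclic_tuples(seq, 3))))
```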

The proof given of Lemma 3.3 is only a mild modification of
the application given by Good [5], which Bondy and Murty [2,
p. 181] relate to the design of an efficient computer drum. A
completely different algorithmic proof of Lemma 3.3 was developed
independently in recent work of Fredricksen and Maiorana [4].

We  now  proceed  to  prove  Theorem  3  by  applying  the  preceding 
lemmas  infinitely  many  times. 

We suppose now that $\{p_i\}_{i=1}^{\infty}$, $\{t_i\}_{i=1}^{\infty}$,
$\{h_i\}_{i=1}^{\infty}$ are increasing sequences of positive integers
and $\{\epsilon_i\}_{i=1}^{\infty}$ is a sequence of positive reals
decreasing to zero. We will define a sequence of functions
$g_0(\omega) = 0, g_1(\omega), g_2(\omega), \ldots, g_k(\omega), \ldots$
where each $g_k(\omega)$ will be defined via $p_k$, $t_k$, $h_k$,
$\epsilon_k$, and the preceding $g_j(\omega)$, $0 \le j < k$. Also,
we should remark that each of the $g_k(\omega)$ will assume only
finitely many values.

For notational convenience we temporarily write $p = p_k$, $t = t_k$,
$h = h_k$, and $\epsilon = \epsilon_k$. We define $\mathcal{C}$ by
$\{a : a = \sum_{r=1}^{p} \epsilon_r 2^{-r}$, $\epsilon_r = 0$ or $1$,
$1 \le r \le p\}$, and note by Lemma 3.3 there is a sequence of
$|\mathcal{C}|^t = 2^{tp}$ elements of $\mathcal{C}$ which form a cycle
in which each ordered t-tuple with letters from $\mathcal{C}$ appears
precisely once among the $(a_{i+1}, a_{i+2}, \ldots, a_{i+t})$ for
$0 \le i < 2^{tp}$.

We now apply Lemma 3.1 to obtain an E such that for
$0 \le i < h2^{tp}$ the sets $T^{-i}E$ are disjoint and their union has
probability at least $1 - \epsilon$. For the finite partition
$\mathcal{H}$ we take the partition given by the distinct values of the
sum $\sum_{j=1}^{k-1} g_j(\omega)2^{-q_{j-1}}$, where $q_0 = 0$ and
$q_j = p_1 + \cdots + p_j$, $1 \le j < k$.

We now define $g_k(\omega) = a_{i'}$ if $\omega \in T^{-i+1}E$ for
$1 \le i \le h2^{tp}$ and $i \equiv i' \bmod 2^{tp}$. Finally, we take
$g_k(\omega) = 0$ if $\omega$ is in none of the $T^{-i+1}E$,
$1 \le i \le h2^{tp}$.

The whole point of this construction is that now by setting
$\Omega_k$ equal to the union of $T^{-i}E$ with
$0 \le i < (h-1)2^{tp}$ and letting
$P_k(A) = P(A \cap \Omega_k)/P(\Omega_k)$, we see that the random
variables $g_k(\omega), g_k(T^{-1}\omega), \ldots, g_k(T^{-t+1}\omega)$
are independent in the probability space
$(\Omega_k, \mathcal{F}_k, P_k)$ where
$\mathcal{F}_k = \{A \cap \Omega_k : A \in \mathcal{F}\}$. Moreover,
these random variables are also independent when conditioned on the
$\sigma$-field given by the partition $\mathcal{H}$. To prove these
assertions one only has to note that for any $H \in \mathcal{H}$ we
have by (3.1) and (3.3)

    $P_k(\{g_k(\omega) = s_0 2^{-p}, g_k(T^{-1}\omega) = s_{-1}2^{-p}, \ldots, g_k(T^{-t+1}\omega) = s_{-t+1}2^{-p}\} \cap H)$
        $= (h-1)P(E)P(H)/((h-1)2^{tp}P(E)) = 2^{-tp}P(H)$.


Finally, we are able to define $f(\omega)$ by letting $q_0 = 0$,
$q_k = \sum_{j=1}^{k} p_j$, and setting

    $f(\omega) = \sum_{k=1}^{\infty} g_k(\omega)2^{-q_{k-1}}$.

The sum representing $f(\omega)$ converges for all $\omega$, and one
finds no difficulty in checking that for
$X_i(\omega) = f(T^{-i+1}\omega)$ we have $P(X_i = X_j) = 0$ for
all $i \ne j$.
To prove Theorem 3 we suppose that the sequence $\rho_n \downarrow 0$
and $\epsilon \in (0,1)$ are given, and we will proceed to show that
$\{p_i\}_{i=1}^{\infty}$, $\{t_i\}_{i=1}^{\infty}$,
$\{h_i\}_{i=1}^{\infty}$, and $\{\epsilon_i\}_{i=1}^{\infty}$ can be
chosen so that $Q_n^*(\epsilon) > n!\,\rho_n^n$ for infinitely many n.


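For the intervals $I_s = [s2^{-q_{k-1}}, (s+1)2^{-q_{k-1}})$ with
$0 \le s < 2^{q_{k-1}}$ we define random variables $V_s$ as the number
of elements of $W_s = \{i : X_i \in I_s, 1 \le i \le n\}$. The ordering
of $\{X_i\}_{i \in W_s}$ is completely determined by the ordering of
$\{g_k(T^{-i+1}\omega)\}_{i \in W_s}$ except for at most those
$\omega \in \mathcal{T} = \{\omega : g_k(T^{-i+1}\omega) = g_k(T^{-j+1}\omega)$
for some $1 \le i < j \le n\}$. Using the $P_k$-independence of the
$\{g_k(T^{-i+1}\omega)\}$, $1 \le i \le n$, for $n \le t$ and the
conditional independence given $\mathcal{H}$ we have, for each
$H \in \mathcal{H}$,

    $P(\{X_{\sigma(1)} < X_{\sigma(2)} < \cdots < X_{\sigma(n)}\} \cap H)$
        $\le \epsilon_k + h_k^{-1} + P(\{X_{\sigma(1)} < X_{\sigma(2)} < \cdots < X_{\sigma(n)}\} \cap H \cap \Omega_k)$
        $\le \epsilon_k + h_k^{-1} + P_k(\{X_{\sigma(1)} < X_{\sigma(2)} < \cdots < X_{\sigma(n)}\} \cap H)$
        $\le \epsilon_k + h_k^{-1} + P_k(\mathcal{T}) + P_k(H)(\prod_s V_s!)^{-1}$.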

s 


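Since there are only $\binom{n}{2}$ places a tie can take place and the
$P_k$-probability of any such tie is $2^{-p_k}$, we have

    $P(\mathcal{T}) \le (1 - \epsilon_k - h_k^{-1})^{-1}P_k(\mathcal{T}) \le 2\binom{n}{2}2^{-p_k} < n^2 2^{-p_k}$.

Also, $P_k(H) = P(H)$ for all $H \in \mathcal{H}$ by (3.3); so summing
over $\mathcal{H}$ we have

    $P(X_{\sigma(1)} < X_{\sigma(2)} < \cdots < X_{\sigma(n)}) \le 2^{q_{k-1}}(\epsilon_k + h_k^{-1}) + n^2 2^{-p_k} + (\prod_s V_s!)^{-1}$.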
Setting $r = 2^{q_{k-1}}$ we have $\sum_{s=0}^{r-1} V_s = n$. Hence,
$\prod_s (V_s!) = \prod_s \Gamma(V_s + 1) \ge \Gamma((n+r)/r)^r$ by the
convexity of $\log\Gamma(x)$, [7, p. 285]. We begin with the bounds

(3.4)  $Q_n^*(\epsilon) > (1-\epsilon)\{\max_\sigma P(X_{\sigma(1)} < X_{\sigma(2)} < \cdots < X_{\sigma(n)})\}^{-1}$
           $\ge (1-\epsilon)\,n!\,\{\Gamma(n+1)\Gamma((n+r)/r)^{-r} + \Gamma(n+1)(2^{-p_k}n^2 + r\epsilon_k + rh_k^{-1})\}^{-1}$.

^k-1 

Since $r = 2^{q_{k-1}}$ is fixed (as we choose $\epsilon_k$, $p_k$,
$t_k$, and $h_k$), we invoke Stirling's formula to obtain
$\Gamma(n+1)\Gamma((n+r)/r)^{-r} \le (r+1)^n$ for $n \ge n_0(r)$. Since
$\rho_n \downarrow 0$, we can now choose $t = t_k \ge n_0$ so that

(3.5)  $(r+1)^{-t} > 2\rho_t^t(1-\epsilon)^{-1}$.

Finally, we choose $\epsilon_k$, $h_k$, and $p_k$ so that

(3.6)  $\{\Gamma(t+1)(2^{-p_k}t^2 + r\epsilon_k + rh_k^{-1})\}^{-1} > 2\rho_t^t(1-\epsilon)^{-1}$.

By the elementary inequality $1/(a+b) \ge (1/2)\min(1/a, 1/b)$ applied
to (3.4), (3.5), and (3.6) for each $t = t_k$, $k = 1, 2, \ldots$ we
have

    $Q_n^*(\epsilon) > \rho_n^n(n!)$  for $n = t_1, t_2, \ldots$.

The  proof  of  Theorem  3  is  thus  complete.  [] 
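
Two analytic facts used in the proof lend themselves to a quick numerical sanity check: the log-convexity bound $\prod_s V_s! \ge \Gamma((n+r)/r)^r$ for nonnegative integers $V_s$ summing to n, and the Stirling-type estimate $\Gamma(n+1)\Gamma((n+r)/r)^{-r} \le (r+1)^n$ for large n. The sketch below works on a logarithmic scale; the function names and the sample values of n and r are assumptions chosen for illustration.

```python
import random
from math import lgamma, log


def log_prod_factorials(counts):
    """log of prod_s V_s!, using lgamma(v + 1) = log(v!)."""
    return sum(lgamma(v + 1) for v in counts)


def log_convexity_lower_bound(n, r):
    """log of Gamma((n + r) / r) ** r, the bound from convexity of log Gamma."""
    return r * lgamma((n + r) / r)


def random_composition(n, r, rng):
    """Random nonnegative integers V_0, ..., V_{r-1} summing to n."""
    counts = [0] * r
    for _ in range(n):
        counts[rng.randrange(r)] += 1
    return counts


def log_gamma_ratio(n, r):
    """log of Gamma(n + 1) * Gamma((n + r) / r) ** (-r)."""
    return lgamma(n + 1) - r * lgamma((n + r) / r)


rng = random.Random(1)
n, r = 50, 8
worst_gap = min(
    log_prod_factorials(random_composition(n, r, rng)) - log_convexity_lower_bound(n, r)
    for _ in range(200)
)
print("smallest observed gap (should be >= 0):", worst_gap)
print("Stirling-type check:", log_gamma_ratio(200, 4) <= 200 * log(4 + 1))
```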

4.  A Brief Speculation

The only two cases where one has a reasonably complete
understanding of $Q_n^*(\epsilon)$ are the case of continuous i.i.d.
random variables and now the case of continuous stationary ergodic
processes with finite entropy. These two cases are in a way polar
opposites of generality, and many important classes of processes lie in
between.

Since many basic probabilistic events are simply unions of the
order statistical events, it would seem to be of considerable interest
to discover those cases in which a precise understanding of
$Q_n^*(\epsilon)$ can be obtained.